Warm-Starting Nested Rollout Policy Adaptation with Optimal Stopping
Authors
Abstract
Nested Rollout Policy Adaptation (NRPA) is an approach using online learning policies in a nested structure. It has achieved great results on a variety of difficult combinatorial optimization problems. In this paper, we propose Meta-NRPA, which combines optimal stopping theory with NRPA for warm-starting and significantly improves the performance of NRPA. We also present several exploratory techniques that enable it to perform better exploration. We establish this on three notoriously difficult problems from telecommunication, transportation, and coding, namely Minimum Congestion Shortest Path Routing, the Traveling Salesman Problem with Time Windows, and Snake-in-the-Box. We also improve the lower bounds for the Snake-in-the-Box problem in multiple dimensions.
Similar resources
Beam Nested Rollout Policy Adaptation
The Nested Rollout Policy Adaptation algorithm is a tree search algorithm known to be efficient on combinatorial problems. However, one problem with this algorithm is that it can converge to a local optimum and get stuck in it. We propose a modification that limits this behavior, and we evaluate it on two combinatorial problems on which Nested Rollout Policy Adaptation is known to perform well.
Nested Rollout Policy Adaptation with Selective Policies
Monte Carlo Tree Search (MCTS) is a general search algorithm that has improved the state of the art for multiple games and optimization problems. Nested Rollout Policy Adaptation (NRPA) is an MCTS variant that has found record-breaking solutions for puzzles and optimization problems. It learns a playout policy online that dynamically adapts the playouts to the problem at hand. We propose to enh...
Improved Diversity in Nested Rollout Policy Adaptation
For combinatorial search in single-player games, nested Monte Carlo search is an apparent alternative to algorithms like UCT that are applied in two-player and general games. To trade off exploration against exploitation, the randomized search procedure intensifies the search with increasing recursion depth. If a concise mapping from states to actions is available, the integration of policy learning yiel...
Nested Rollout Policy Adaptation for Monte Carlo Tree Search
Monte Carlo tree search (MCTS) methods have had recent success in games, planning, and optimization. MCTS uses results from rollouts to guide search; a rollout is a path that descends the tree with a randomized decision at each ply until reaching a leaf. MCTS results can be strongly influenced by the choice of appropriate policy to bias the rollouts. Most previous work on MCTS uses static unifo...
Distributed Nested Rollout Policy for SameGame
Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search heuristic for puzzles and other optimisation problems. It achieves state-of-the-art performance on several games including SameGame. In this paper, we design several parallel and distributed NRPA-based search techniques, and we provide a number of experimental insights about their execution. Finally, we use our best implementation to...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i10.26459